Identification of Reliable Information for Classification Problems
Authors
Abstract
A novel information identification model is proposed to support accurate classification on data with a mixture of categorical and real-valued attributes. The model combines the advantages of rough set theory and a cluster validity method to raise classification quality. Real-valued attribute values are pre-processed with the fuzzy c-means clustering method and then analyzed with variable precision rough set theory. Our cluster validity index finalizes the information system by selecting a feasible number of clusters for each attribute. Experimental results show that, when the data include a considerable number of ambiguous instances, the model clearly improves on traditional classifiers in both classification accuracy and discrimination power. The paper thus offers a better way to generate reliable decision rules for classification problems with mixed attributes.
Key-Words: Reliable information, classification problems, fuzzy c-means, variable precision rough set, cluster validity index, discrimination power
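The pipeline named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define their cluster validity index, so Bezdek's partition coefficient is swapped in as a stand-in, and the variable precision rough set step uses one common formulation (an equivalence class joins the β-lower approximation when at least a fraction β of its members belong to the target class). All function names, the evenly spaced center initialization, and the defaults `m=2.0` and `beta=0.8` are assumptions made for illustration.

```python
from collections import defaultdict


def fuzzy_c_means_1d(values, c, m=2.0, iters=100, tol=1e-9):
    """Fuzzy c-means on one real-valued attribute; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start with centers spread evenly over the range.
    centers = [lo + (i + 0.5) * (hi - lo) / c for i in range(c)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        memberships = []
        for x in values:
            dists = [abs(x - v) for v in centers]
            if min(dists) == 0.0:
                # x sits exactly on a center: give it full membership there.
                hit = dists.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(c)]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        new_centers = []
        for i in range(c):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            weighted = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted / total if total > 0 else centers[i])
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def partition_coefficient(memberships):
    """Bezdek's partition coefficient (stand-in index): closer to 1 means crisper."""
    return sum(u * u for row in memberships for u in row) / len(memberships)


def choose_cluster_number(values, candidates=(2, 3, 4)):
    """Pick the candidate cluster count with the highest partition coefficient."""
    return max(candidates,
               key=lambda c: partition_coefficient(fuzzy_c_means_1d(values, c)[1]))


def discretize(memberships):
    """Turn fuzzy memberships into one categorical label per instance."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


def beta_lower_approximation(rows, labels, target, beta=0.8):
    """Indices of instances whose equivalence class is at least beta-pure in target."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row)].append(idx)  # identical attribute tuples are indiscernible
    lower = []
    for members in classes.values():
        hits = sum(1 for i in members if labels[i] == target)
        if hits / len(members) >= beta:
            lower.extend(members)
    return sorted(lower)
```

In use, each real-valued attribute would be clustered separately, its cluster count chosen by the validity index, and the resulting labels placed next to the categorical attributes before computing approximations. The partition coefficient is known to favor small cluster counts, which is one reason a purpose-built index such as the one the paper proposes may be preferable.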
Similar resources
Author gender identification from text using Bayesian Random Forest
The heavy use of virtual environments today, and the way users connect through social networks such as Facebook, Instagram, and Twitter, make it more necessary than ever to identify the subjects users share in these environments. Several applications benefit from reliable methods for inferring the age and gender of social media users. Such applications exist across a wide area of fields,...
Classification and Identification of Geological Facies Using Seismic Data and Competitive Neural Networks
Geological facies interpretation is essential for reservoir studies. Classifying and identifying seismic traces is a powerful approach to geological facies classification and discrimination. The use of neural networks as classifiers is increasing in many sciences, including seismology. They are computationally efficient and well suited to pattern identification. They can simply learn new algori...
Phishing website detection using weighted feature line embedding
The aim of phishing is to obtain users' private information without their permission by designing a new website that mimics a trusted one. Information technology specialists do not agree on a single definition of the discriminative features that characterize phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. M...
Experience Management in Maintenance and After Sale Service of an Industrial Enterprise
Background and Aim: Experience management and the tacit knowledge of an organization's employees are considered among the most important assets of today's leading companies. This study was carried out in a company that manufactures bending machines for tube and wire. The quality of after-sales service and the technicians' performance with customers are important in this compan...
Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a sharp rise in the number of frauds arising from abnormal user behavior is causing billions of dollars in losses worldwide. T...